ClaSP: An Efficient Algorithm for Mining Frequent Closed Sequences
Authors
Abstract
In this paper, we propose a new algorithm, called ClaSP, for mining frequent closed sequential patterns in temporal transaction data. The algorithm combines several efficient search-space pruning methods with a vertical database layout. Experiments on both synthetic and real datasets show that ClaSP outperforms well-known state-of-the-art methods such as CloSpan.
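To make the two ideas named in the abstract concrete, the following is a minimal sketch of a vertical layout and of what "closed" means. It is not ClaSP itself: it uses none of the paper's pruning methods, finds closed patterns by brute force, and assumes every event is a single item. All function names (`build_vertical`, `support`, `frequent_closed`) are hypothetical.

```python
from collections import defaultdict


def build_vertical(db):
    """Vertical layout: item -> {sequence id: positions where the item occurs}."""
    vertical = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            vertical[item].setdefault(sid, []).append(pos)
    return vertical


def support(pattern, vertical):
    """Number of sequences containing `pattern` as a subsequence.

    Only the per-item position lists are consulted; the original
    sequences are never rescanned.
    """
    first, *rest = pattern
    # Earliest position at which the current prefix can end, per sequence.
    ends = {sid: pos[0] for sid, pos in vertical.get(first, {}).items()}
    for item in rest:
        occurrences = vertical.get(item, {})
        extended = {}
        for sid, end in ends.items():
            later = [p for p in occurrences.get(sid, []) if p > end]
            if later:
                extended[sid] = later[0]
        ends = extended
    return len(ends)


def is_subsequence(a, b):
    """True if `a` can be obtained from `b` by deleting elements."""
    it = iter(b)
    return all(x in it for x in a)


def frequent_closed(db, min_sup):
    """Brute force (assumption: toy-sized input): enumerate all frequent
    patterns level by level, then keep those with no frequent
    super-sequence of equal support."""
    vertical = build_vertical(db)
    items = sorted(vertical)
    frequent = {}
    frontier = [(i,) for i in items]
    while frontier:
        next_frontier = []
        for pattern in frontier:
            sup = support(pattern, vertical)
            if sup >= min_sup:
                frequent[pattern] = sup
                next_frontier.extend(pattern + (i,) for i in items)
        frontier = next_frontier
    return {
        p: s
        for p, s in frequent.items()
        if not any(
            s == s2 and len(q) > len(p) and is_subsequence(p, q)
            for q, s2 in frequent.items()
        )
    }
```

On the toy database `[("a", "b", "c"), ("a", "b", "c"), ("a", "b"), ("b",)]` with `min_sup=2`, the pattern `("a",)` is frequent but not closed, because `("a", "b")` occurs in exactly the same three sequences. Only `("b",)`, `("a", "b")`, and `("a", "b", "c")` remain as closed patterns.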
Similar resources
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It involves extracting patterns that are frequently accessed in mobile web service sequences. Unlike the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
Efficient Mining of Top-K Closed Sequences
Sequence mining is an important data mining task. To retrieve interesting sequences from a large database, a minimum support threshold must be specified. Unfortunately, specifying an appropriate support threshold is very difficult for users who are new to mining queries and task-specific data. To avoid the difficulty of specifying an appropriate support thre...
Extracting Feature Sequences in Software Vulnerabilities Based on Closed Sequential Pattern Mining
Feature extraction is important for detecting security vulnerabilities in software. Mining closed sequential patterns provides complete and condensed information for generating non-redundant frequent sequences. In this paper, we discuss the feature interaction problem and propose an efficient algorithm to extract features in vulnerability sequences. Each closed sequential pattern represents...
Fast Vertical Mining of Sequential Patterns Using Co-occurrence Information
Sequential pattern mining algorithms using a vertical representation are the most efficient for mining sequential patterns in dense or long sequences, and have excellent overall performance. The vertical representation allows generating patterns and calculating their supports without performing costly database scans. However, a crucial performance bottleneck of vertical algorithms is that they ...
Mining Closed Episodes from Event Sequences Efficiently
Recent studies have proposed different methods for mining frequent episodes. In this work, we study the problem of mining closed episodes based on minimal occurrences. We study the properties of minimal occurrences and design effective pruning techniques to prune non-closed episodes. An efficient mining algorithm Clo_episode is proposed to mine all closed episodes following a breadth-first sear...